Unifying Sparsest Cut, Cluster Deletion, and Modularity Clustering Objectives with Correlation Clustering
Graph clustering, or community detection, is the task of identifying groups
of closely related objects in a large network. In this paper we introduce a new
community-detection framework called LambdaCC that is based on a specially
weighted version of correlation clustering. A key component in our methodology
is a clustering resolution parameter, λ, which implicitly controls the
size and structure of clusters formed by our framework. We show that, by
increasing this parameter, our objective effectively interpolates between two
different strategies in graph clustering: finding a sparse cut and forming
dense subgraphs. Our methodology unifies and generalizes a number of other
important clustering quality functions including modularity, sparsest cut, and
cluster deletion, and places them all within the context of an optimization
problem that has been well studied from the perspective of approximation
algorithms. Our approach is particularly relevant in the regime of finding
dense clusters, as it leads to a 2-approximation for the cluster deletion
problem. We use our approach to cluster several graphs, including large
collaboration networks and social networks.
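
To make the role of the resolution parameter concrete, the following is a minimal sketch of the disagreement objective for the standard unweighted form of LambdaCC. It assumes each edge carries a positive weight of 1 - λ and each non-edge a negative weight of λ; the function name and the toy graph are illustrative only and are not taken from the paper's code.

```python
from itertools import combinations


def lambdacc_disagreements(n, edges, labels, lam):
    """Weight of disagreements for a clustering under unweighted LambdaCC.

    Assumption: a cut edge costs (1 - lam); a non-adjacent pair placed in
    the same cluster costs lam.
    """
    edge_set = {frozenset(e) for e in edges}
    cost = 0.0
    for i, j in combinations(range(n), 2):
        same_cluster = labels[i] == labels[j]
        if frozenset((i, j)) in edge_set:
            if not same_cluster:
                cost += 1.0 - lam  # edge separated by the clustering
        elif same_cluster:
            cost += lam  # non-edge kept inside one cluster
    return cost


# Toy graph: two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = lambdacc_disagreements(6, toy_edges, [0, 0, 0, 1, 1, 1], 0.25)
one_cluster = lambdacc_disagreements(6, toy_edges, [0] * 6, 0.25)
```

At λ = 0.25 the two-triangle clustering pays only for the one cut edge, while the single-cluster solution pays λ for each of its eight non-adjacent pairs. Lowering λ makes non-edges cheaper to absorb and favors larger, sparser clusters; raising it favors small, dense ones.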
On the Optimal Recovery of Graph Signals
Learning a smooth graph signal from partially observed data is a well-studied
task in graph-based machine learning. We consider this task from the
perspective of optimal recovery, a mathematical framework for learning a
function from observational data that adopts a worst-case perspective tied to
model assumptions on the function to be learned. Earlier work in the optimal
recovery literature has shown that minimizing a regularized objective produces
optimal solutions for a general class of problems, but did not fully identify
the regularization parameter. Our main contribution provides a way to compute
regularization parameters that are optimal or near-optimal (depending on the
setting), specifically for graph signal processing problems. Our results offer
a new interpretation for classical optimization techniques in graph-based
learning and also come with new insights for hyperparameter selection. We
illustrate the potential of our methods in numerical experiments on several
semi-synthetic graph signal processing datasets. Comment: This paper has been
accepted by the 14th International Conference on Sampling Theory and
Applications (SampTA 2023).
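
As a rough illustration of the kind of regularized objective the abstract refers to, the sketch below recovers a signal by Laplacian-regularized least squares. The specific objective, the function name, and the parameter `tau` are assumptions chosen for illustration; the abstract does not state the paper's exact formulation, and the sketch does not implement its rule for choosing the regularization parameter.

```python
def laplacian_regularized_recovery(n, edges, observed, tau):
    """Minimize sum_{i observed} (x_i - y_i)^2 + tau * sum_{(i,j) in E} (x_i - x_j)^2.

    `observed` maps node index -> observed value. Setting the gradient to
    zero gives the linear system (P + tau * L) x = P y, where P marks the
    observed nodes and L is the graph Laplacian. The system is solvable when
    tau > 0 and every connected component has at least one observed node.
    """
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i, y in observed.items():
        A[i][i] += 1.0
        b[i] += y
    for i, j in edges:
        A[i][i] += tau
        A[j][j] += tau
        A[i][j] -= tau
        A[j][i] -= tau
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(A[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (b[r] - tail) / A[r][r]
    return x


# Path graph 0 - 1 - 2 with the two endpoints observed.
recovered = laplacian_regularized_recovery(3, [(0, 1), (1, 2)], {0: 0.0, 2: 2.0}, 1.0)
```

On this toy path the unobserved middle node is interpolated between its neighbors, and larger values of `tau` pull the observed endpoints toward each other, which is why the choice of regularization parameter matters.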
Densest Subhypergraph: Negative Supermodular Functions and Strongly Localized Methods
Dense subgraph discovery is a fundamental primitive in graph and hypergraph
analysis which among other applications has been used for real-time story
detection on social media and improving access to data stores of social
networking systems. We present several contributions for localized densest
subgraph discovery, which seeks dense subgraphs located near a given seed
set of nodes. We first introduce a generalization of a recent anchored
densest subgraph problem, extending this previous objective
to hypergraphs and also adding a tunable locality parameter that controls the
extent to which the output set overlaps with seed nodes. Our primary technical
contribution is to prove when it is possible to obtain a strongly-local
algorithm for solving this problem, meaning that the runtime depends only on
the size of the input set. We provide a strongly-local algorithm that applies
whenever the locality parameter is at least 1, and show via counterexample
why strongly-local algorithms are impossible below this threshold. Along the
way to proving our results for localized densest subgraph discovery, we also
provide several advances in solving global dense subgraph discovery objectives.
This includes the first strongly polynomial time algorithm for the densest
supermodular set problem and a flow-based exact algorithm for a densest
subgraph discovery problem in graphs with arbitrary node weights. We
demonstrate the utility of our algorithms on several web-based data analysis
tasks.
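
For readers unfamiliar with the underlying objective, here is a small sketch of the classical global densest subgraph problem on an unweighted graph, where density is the number of induced edges divided by the number of nodes. It uses exhaustive search purely to illustrate the definition on tiny inputs; it is not the flow-based or strongly-local method described in the abstract, and the function names are hypothetical.

```python
from itertools import combinations


def density(nodes, edges):
    """Induced edges divided by number of nodes, for a non-empty node set."""
    node_set = set(nodes)
    inside = sum(1 for u, v in edges if u in node_set and v in node_set)
    return inside / len(node_set)


def densest_subgraph_bruteforce(n, edges):
    """Try every non-empty subset of nodes; exponential time, toy inputs only."""
    best_set, best_density = None, -1.0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            d = density(subset, edges)
            if d > best_density:
                best_set, best_density = set(subset), d
    return best_set, best_density


# A 4-clique on nodes 0..3 with a path 3 - 4 - 5 hanging off it.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
best_nodes, best_value = densest_subgraph_bruteforce(6, toy_edges)
```

On this toy graph the search picks out the 4-clique, since adding the path nodes increases the node count faster than the edge count.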